A Cramér moderate deviation theorem for Hotelling's $T^2$-statistic
is proved under a finite $(3+\delta)$th moment. The result is applied to
large scale tests on the equality of mean vectors and it is shown that
the number of tests can be as large as $e^{o(n^{1/3})}$ before the chi-squared
distribution calibration becomes inaccurate. As an application of the
moderate deviation results, a global test on the equality of $m$ mean
vectors based on the maximum of Hotelling's $T^2$-statistics is developed
and its asymptotic null distribution is shown to be an extreme
value type I distribution. A novel intermediate approximation to the
null distribution is proposed to improve the slow convergence rate
of the extreme distribution approximation. Numerical studies show
that the new test procedure works well even for a small sample size
and performs favorably in analyzing a breast cancer dataset.

1. Introduction. Consider the following $m$ simultaneous tests:

(1.1) $H_{0i}: \mu_{1i} = \mu_{2i}$ versus $H_{1i}: \mu_{1i} \neq \mu_{2i}$

for $1 \le i \le m$, where $\mu_{1i}$ and $\mu_{2i}$ are $d_i \ge 1$-dimensional mean vectors, and
$d_i$ are uniformly bounded. When $d_i = 1$, the multiple testing problem (1.1)
has been extensively studied. A common statistical method is the two sample
$t$-test together with a multiple comparison procedure controlling the
familywise error rate (FWER) or the false discovery rate (FDR). The theoretical
justification of this method can be found in Fan, Hall and Yao (2007).
Although not much attention has been paid to the multivariate case di > 1, 
(1.1) has arisen from several important applications including shape analysis 
of brain structures and gene selection. 

• Shape analysis of brain structures. There is a growing interest in statisti- 
cal shape analysis within the neuroimaging community; see Styner et al. 
(2006), Zhao et al. (2008), Gerardina et al. (2009). Styner et al. (2006) de- 
veloped a widely-used software to locate significant shape changes between 
healthy and pathological brain structures. The final and most important 
step in the procedure of Styner et al. (2006) is the simultaneous testing of (1.1)
with $\mu_{1i}$ and $\mu_{2i}$ being mean vectors of 3 coordinates of surface points.
The number of tests $m$ can be hundreds or even thousands and $d_i = 3$
for all $i$. In Styner et al. (2006), two sample Hotelling's $T^2$-statistics $T_{ni}^2$
were used for each $H_{0i}$ and the Benjamini-Hochberg procedure was used to
control the FDR.

• Gene selection. In the breast cancer dataset analyzed by Martens et al. 
(2005), every gene corresponds to a two to six-dimensional vector that 
represents the DNA methylation status of CpG sites. The dimension $d_i$ is
between 2 and 6. In Martens et al. (2005), two sample Hotelling's $T^2$-statistics
and the Benjamini-Hochberg FDR correction were used to identify the
significantly different genes between two patient groups.

It is well known that Hotelling's $T^2$-statistic is asymptotically chi-squared
distributed when the underlying distribution has a finite second moment.
This provides a natural way to estimate $p$-values. In the "large $m$ small $n$"
statistical analysis, the true $p$-values are typically small, of order $O(1/m)$ in
the FDR procedure. A basic question is:

With how many tests can the chi-squared distribution calibration be applied
before the tests become inaccurate?

As discussed in Fan, Hall and Yao (2007) and Liu and Shao (2010), the
question can be answered with Cramér-type moderate deviation results. The
moderate deviation behavior of the $t$-statistic is now well understood; however,
a Cramér-type moderate deviation theorem for Hotelling's $T^2$-statistic is still
not available. The main purpose of this paper is to establish the moderate
deviation theorem for Hotelling's $T^2$-statistic (one-sample and two-sample).
We shall prove that under a finite $(3+\delta)$th moment, Hotelling's $T^2$-statistic
$T_n^2$ satisfies

$$\frac{P(T_n^2 \ge x^2)}{P(\chi^2(d) \ge x^2)} \to 1$$

uniformly for $x \in [0, o(n^{1/6}))$. Consequently, the number of tests can be as
large as $e^{o(n^{1/3})}$ before the chi-squared distribution calibration becomes
inaccurate; see (2.2).
As an application of the moderate deviation result, we consider the global
testing

(1.2) $H_0: \mu_{1i} = \mu_{2i}$ for all $1 \le i \le m$ against $H_1: \mu_{1i} \neq \mu_{2i}$ for some $i$.

In shape analysis of brain structures with $d_i = 3$, the global test (1.2) is often
used to determine whether the brain shapes of two groups of subjects
are different or not; see Cao and Worsley (1999), Taylor and Worsley
(2008). In gene selection [Martens et al. (2005)], (1.2) has been used to test
whether the endocrine therapy is effective on DNA methylation status. Here
we are particularly interested in the alternative hypothesis that the locations
where $\mu_{1i} \neq \mu_{2i}$ are sparse. For example, in the brain structures, the shape
differences are commonly assumed to be confined to a small number of isolated
regions inside the whole brain. In this paper, we shall propose a testing
procedure based on the maximum of Hotelling's $T^2$-statistics. The proposed
test procedure has several advantages. It is quite robust to the tails of
the underlying distribution and the dependence structure. It converges to
the given significance level with a rate of y^(logm)^/n. A numerical study 
shows that the test procedure works quite well even for small samples. 

The rest of our paper is organized as follows. In Section 2, we state Cramér
moderate deviation results for Hotelling's $T^2$-statistic. In Section 3, we introduce
our test procedure for the global test (1.2). Theoretical results of
the robustness on the tails and dependence structures are given. The power 
of the test procedure is also investigated. A numerical study is carried out 
in Section 4, in which we compare our test procedure to some existing test 
procedures. The proofs of the main results are postponed to Section 5. 

2. A Cramér type moderate deviation theorem for Hotelling's $T^2$-statistic.

The properties of Hotelling's $T^2$-statistic under normality are well known
[Anderson (2003)]. Large and moderate deviations (logarithm of the tail
probabilities) were obtained in Dembo and Shao (2006). In this section, we
shall establish a Cramér moderate deviation theorem for Hotelling's
$T^2$-statistic. For the Student $t$-statistic, the Cramér moderate deviation result was
first obtained by Shao (1999) under a finite third moment and the result
was extended to self-normalized sums of independent random variables in
Jing, Shao and Wang (2003). We refer to de la Peña, Lai and Shao (2009)
for a systematic presentation of the self-normalized limit theory and its
statistical applications.

Let $\{X_1, \ldots, X_{n_1}\}$ and $\{Y_1, \ldots, Y_{n_2}\}$ be two groups of i.i.d. $d$-dimensional
random vectors with mean vectors $\mu_1$ and $\mu_2$ and covariance matrices $\Sigma_1$
and $\Sigma_2$, respectively. Assume that $\{X_1, \ldots, X_{n_1}\}$ and $\{Y_1, \ldots, Y_{n_2}\}$ are
independent and $\Sigma_1$ and $\Sigma_2$ are positive definite. Let

$$\bar X = \frac{1}{n_1}\sum_{k=1}^{n_1} X_k, \qquad \bar Y = \frac{1}{n_2}\sum_{k=1}^{n_2} Y_k$$

be the sample means and

$$V_{n_1} = \frac{1}{n_1}\sum_{k=1}^{n_1}(X_k - \bar X)(X_k - \bar X)', \qquad V_{n_2} = \frac{1}{n_2}\sum_{k=1}^{n_2}(Y_k - \bar Y)(Y_k - \bar Y)'$$

be the sample covariance matrices, where for a vector $a$, $a'$ denotes its transpose.
The two sample Hotelling's $T^2$-statistic is then defined by

$$T_n^2 = (\bar X - \bar Y)'\Big(\frac{1}{n_1}V_{n_1} + \frac{1}{n_2}V_{n_2}\Big)^{-1}(\bar X - \bar Y).$$
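The definition above can be turned into a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes the sample covariance matrices are normalised by $1/n_1$ and $1/n_2$, and the helper names (`mean_vector`, `scatter_over_n`, `solve`, `hotelling_t2`) are hypothetical. It uses only the Python standard library and solves the $d \times d$ linear system by Gaussian elimination, which is adequate for the small dimensions ($d \le 6$) considered in this paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(sample: Sequence[Vector]) -> List[float]:
    """Coordinate-wise sample mean of a list of d-dimensional points."""
    n, d = len(sample), len(sample[0])
    return [sum(x[j] for x in sample) / n for j in range(d)]


def scatter_over_n(sample: Sequence[Vector]) -> List[List[float]]:
    """Sample covariance matrix with the 1/n normalisation (an assumption)."""
    n, d = len(sample), len(sample[0])
    mu = mean_vector(sample)
    return [
        [sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in sample) / n for b in range(d)]
        for a in range(d)
    ]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance estimate is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * d
    for r in range(d - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, d))
        z[r] = (aug[r][d] - tail) / aug[r][r]
    return z


def hotelling_t2(xs: Sequence[Vector], ys: Sequence[Vector]) -> float:
    """Two sample statistic (xbar - ybar)' (V1/n1 + V2/n2)^{-1} (xbar - ybar)."""
    n1, n2, d = len(xs), len(ys), len(xs[0])
    diff = [a - b for a, b in zip(mean_vector(xs), mean_vector(ys))]
    v1, v2 = scatter_over_n(xs), scatter_over_n(ys)
    combined = [[v1[a][b] / n1 + v2[a][b] / n2 for b in range(d)] for a in range(d)]
    z = solve(combined, diff)  # z = combined^{-1} diff, without forming the inverse
    return sum(u * w for u, w in zip(diff, z))
```

For $d = 1$ the function reduces to the squared unpooled (Welch-type) two sample $t$-statistic with $1/n$ variance estimates.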

Let $n_1 \asymp n_2$ denote the inequality $c_1 \le n_1/n_2 \le c_2$ for some positive constants
$c_1$ and $c_2$. The following result gives a Cramér type moderate deviation for
Hotelling's $T^2$-statistic.

Theorem 2.1. Suppose that $n_1 \asymp n_2$, $E\|X_1\|^{3+\delta} < \infty$ and $E\|Y_1\|^{3+\delta} < \infty$
for some $\delta > 0$. Then, under $\mu_1 = \mu_2$,

(2.1) $\frac{P(T_n^2 \ge x^2)}{P(\chi^2(d) \ge x^2)} \to 1$ as $n \to \infty$

uniformly for $x \in [0, o(n^{1/6}))$, where $n = n_1 + n_2$.

Theorem 2.1 shows that the true distribution of $T_n^2$ can be well approximated
by the $\chi^2(d)$ distribution uniformly in the interval $[0, o(n^{1/3}))$ under
the finite $(3+\delta)$th moment. Let $F_n(x) = P(T_n^2 \ge x \mid \mu_1 = \mu_2)$ and $F(x) =
P(\chi^2(d) \ge x)$. Then, the true $p$-value is $p_n = F_n(T_n^2)$ and the estimated
$p$-value is $\hat p_n = F(T_n^2)$. Thus by (2.1),

(2.2) $\left|\frac{\hat p_n}{p_n} - 1\right| I\{p_n \ge e^{-o(n^{1/3})}\} = o(1)$.

This provides a theoretical justification of the accuracy of the estimated
$p$-values by the chi-squared distribution used in the B-H FDR correction method.
We refer to Fan, Hall and Yao (2007) and Liu and Shao (2010) for a more
detailed discussion of the relations between the Cramér type moderate
deviation and the accuracy of the estimated $p$-values used in large scale tests.
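The estimated $p$-value $F(T_n^2) = P(\chi^2(d) \ge T_n^2)$ needs only the chi-squared tail function. The sketch below is one way to evaluate it with the standard library alone; it is an illustration rather than anything taken from the paper. It relies on the standard recurrence $Q_{d+2}(x) = Q_d(x) + (x/2)^{d/2}e^{-x/2}/\Gamma(d/2+1)$ for the upper tail $Q_d(x) = P(\chi^2(d) \ge x)$, started from $Q_1(x) = \mathrm{erfc}(\sqrt{x/2})$ and $Q_2(x) = e^{-x/2}$. The function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x: float, d: int) -> float:
    """Upper tail P(chi^2(d) >= x) for integer degrees of freedom d >= 1."""
    if d < 1:
        raise ValueError("degrees of freedom must be a positive integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if d % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1  # Q_1
    else:
        q, k = math.exp(-half), 2             # Q_2
    while k < d:
        # Add (x/2)^{k/2} e^{-x/2} / Gamma(k/2 + 1), computed on the log scale
        # so that very large x does not overflow.
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q
```

With this helper, the chi-squared calibrated $p$-value of an observed statistic `t2` in dimension `d` is `chi2_sf(t2, d)`; note that the argument is $T_n^2$ itself, not its square root.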
For the one-sample Hotelling's $T^2$-statistic, we have a similar result.

Theorem 2.2. Suppose that $E\|X_1\|^{3+\delta} < \infty$ for some $\delta > 0$. Then

(2.3) $\frac{P(T_n^2 \ge x^2)}{P(\chi^2(d) \ge x^2)} \to 1$ as $n \to \infty$

uniformly for $x \in [0, o(n^{1/6}))$.

The proof of Theorem 2.2 is completely similar to that of Theorem 2.1
and so will be omitted.

Remark 2.1. As proved by Shao (1999) and Jing, Shao and Wang
(2003), (2.1) and (2.3) hold under finite third moments when $d = 1$ and
the range $[0, o(n^{1/6}))$ is the widest possible. We conjecture that (2.1) and
(2.3) remain valid for $d \ge 2$ under a finite third moment and that the range
$[0, o(n^{1/6}))$ is optimal.

3. Global testing. In this section, we are interested in the global testing
(1.2), that is,

$H_0: \mu_{1i} = \mu_{2i}$ for all $1 \le i \le m$ against

$H_1: \mu_{1i} \neq \mu_{2i}$ for some $i$,

where $\mu_{1i}$ and $\mu_{2i}$ are $d_i$-dimensional mean vectors of random vectors $X^i$
and $Y^i$, respectively.

Write $a = (\mu_{11}, \ldots, \mu_{1m})$ and $b = (\mu_{21}, \ldots, \mu_{2m})$. Most existing works
on global tests focus on the alternative that $a - b$ is either sparse
or dense. When the alternative is sparse, the commonly used test statistic is 
the maximum of univariate t-statistics and the higher criticism (HC*) test 
procedure [Donoho and Jin (2004), Hall and Jin (2010)]. On the other hand, 
if the signals are dense, then the squared sum type test statistics have been 
used [Chen and Qin (2010)]. In this section, we focus on the sparse alter- 
native hypothesis. The main difference between the current paper and the 
previous works is that the sparse signals appear in groups and that the un- 
derlying distributions are not necessarily normal and the components may 
not have an ordered structure. For the sparse case, it has been proved in 
Donoho and Jin (2004) that the higher criticism statistic enjoys some op- 
timal properties with respect to the detection region. On the other hand, 
the independence between variables plays an important role in the control 
of type I errors of the higher criticism statistic. The simulation in Section 4 
shows that HC* statistic may not be robust against the dependence and 
may fail to control the type I error. In contrast, our test procedure intro- 
duced below is robust to dependence, as shown by Theorems 3.1-3.4 and 
the simulation. 

Suppose that we have two groups of i.i.d. observations

$$\mathcal{X} = \{X_k^1, \ldots, X_k^m; 1 \le k \le n_1\} \quad\text{and}\quad \mathcal{Y} = \{Y_k^1, \ldots, Y_k^m; 1 \le k \le n_2\}$$

with mean vectors $\{\mu_{11}, \ldots, \mu_{1m}\}$ and $\{\mu_{21}, \ldots, \mu_{2m}\}$, respectively. The two
groups of observations $\mathcal{X}$ and $\mathcal{Y}$ are independent. Let $T_{ni}^2$ be the two sample
Hotelling's $T^2$-statistics based on $\{X_k^i; 1 \le k \le n_1\}$ and $\{Y_k^i; 1 \le k \le n_2\}$.
We introduce our test procedure as follows.
Case 1. $d_i = d$. Let $W_{1,k}$, $1 \le k \le n_1$, and $W_{2,k}$, $1 \le k \le n_2$, be i.i.d.
multivariate normal vectors with mean zero and covariance matrix $I_d$. Let

(3.1) $F_{n_1,n_2}(y) = P(T_n^2 \ge y)$,

where $T_n^2$ is the two sample Hotelling's $T^2$-test statistic based on $\{W_{1,k}\}$
and $\{W_{2,k}\}$. For given $0 < \alpha < 1$, let $y_n(\alpha)$ satisfy

(3.2) $\exp(-mF_{n_1,n_2}(y_n(\alpha))) = 1 - \alpha$.

Note that $1 - F_{n_1,n_2}(y)$ is closely related to the $F$ distribution. In general, we
can use simulation to obtain $y_n(\alpha)$. Our test procedure for (1.2) is $\Phi_\alpha^*$, where

(3.3) $\Phi_\alpha^* = I\Big\{\max_{1\le i\le m} T_{ni}^2 \ge y_n(\alpha)\Big\}$.

The hypothesis $H_0$ is rejected whenever $\Phi_\alpha^* = 1$.

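Case 2. $d_i$ may be different. Let $F_{n_1,n_2,d_i}(y)$ be defined as in (3.1) with $d$
being replaced with $d_i$. Let $G_{n_1,n_2,d_i}(y) = 1 - F_{n_1,n_2,d_i}(y)$. We now define

$$\Phi_\alpha^* = I\Big\{\max_{1\le i\le m} G_{n_1,n_2,d_i}(T_{ni}^2) \ge g_m(\alpha)\Big\}$$

with $g_m(\alpha) = 1 + m^{-1}\log(1-\alpha)$. The hypothesis $H_0$ is rejected whenever
$\Phi_\alpha^* = 1$. Note that this test coincides with the test (3.3) of Case 1 if $d_i = d$.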
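The decision rule of Case 2 is short enough to sketch in code. The snippet below is an illustration under stated assumptions, not the authors' software: the caller is assumed to have already evaluated $G_{n_1,n_2,d_i}(T_{ni}^2)$ for every test (for example by simulation under normality), and the function names `g_threshold` and `global_test` are hypothetical. Since $G \ge g_m(\alpha)$ is the same as $F \le -\log(1-\alpha)/m$, the rule is close to a Bonferroni correction at level $\alpha/m$.

```python
import math
from typing import Sequence


def g_threshold(m: int, alpha: float) -> float:
    """g_m(alpha) = 1 + log(1 - alpha) / m."""
    return 1.0 + math.log(1.0 - alpha) / m


def global_test(null_cdf_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Return True (reject H_0) when max_i G_i(T_i^2) >= g_m(alpha).

    null_cdf_values[i] is the value G_{n1,n2,d_i}(T_{ni}^2), supplied by the caller.
    """
    m = len(null_cdf_values)
    return max(null_cdf_values) >= g_threshold(m, alpha)
```

For instance, with $m = 117$ tests and $\alpha = 0.05$, `g_threshold(117, 0.05)` rounds to 0.9996.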

Remark 3.1. By Theorem 3.1, $\max_{1\le i\le m} T_{ni}^2$ converges to the extreme
type I distribution. It seems natural to define the following test $\Phi_\alpha$:

(3.4) $\Phi_\alpha = I\Big\{\max_{1\le i\le m} T_{ni}^2 \ge 2\log m + (d-2)\log\log m + q_\alpha\Big\}$,

where $q_\alpha = -2\log(\Gamma(d/2)) - 2\log\log(1-\alpha)^{-1}$. The hypothesis $H_0$ is rejected
whenever $\Phi_\alpha = 1$. However, it is well known that the rate of convergence to
the extreme distribution is very slow [see Liu, Lin and Shao (2008)]. On the
other hand, the intermediate approximation given in Theorem 3.3 can
substantially improve the convergence rate. This leads to our test procedure $\Phi_\alpha^*$.
Numerical results in Section 4 show that $\Phi_\alpha^*$ outperforms $\Phi_\alpha$ significantly
and that it works well even when the sample size is small.
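The extreme-value critical value in (3.4) is a closed-form expression, sketched below with the standard library. The function names are hypothetical. The helper `limit_cdf` encodes the type I extreme value form $\exp(-e^{-y/2}/\Gamma(d/2))$, which is the limit law under which the choice of $q_\alpha$ in (3.4) gives an asymptotically level-$\alpha$ test; it is included only as a consistency check on $q_\alpha$.

```python
import math


def q_alpha(alpha: float, d: int) -> float:
    """q_alpha = -2 log Gamma(d/2) - 2 log log (1 - alpha)^{-1}."""
    return -2.0 * math.lgamma(d / 2.0) - 2.0 * math.log(-math.log(1.0 - alpha))


def extreme_value_threshold(m: int, alpha: float, d: int) -> float:
    """Critical value 2 log m + (d - 2) log log m + q_alpha used by the test (3.4)."""
    log_m = math.log(m)  # requires m > 1 so that log log m is defined
    return 2.0 * log_m + (d - 2) * math.log(log_m) + q_alpha(alpha, d)


def limit_cdf(y: float, d: int) -> float:
    """Type I extreme value limit exp(-exp(-y/2) / Gamma(d/2))."""
    return math.exp(-math.exp(-y / 2.0) / math.gamma(d / 2.0))
```

By construction, `limit_cdf(q_alpha(alpha, d), d)` equals `1 - alpha` up to rounding.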

3.1. The limiting distribution of $\max_{1\le i\le m} T_{ni}^2$. In this subsection, we
show that the type I error of $\Phi_\alpha^*$ converges to $\alpha$ under some mild moment
conditions and dependence structure. To this end, we need to establish the
limiting distribution of $\max_{1\le i\le m} T_{ni}^2$ under $H_0$. Let $\Sigma_i = \Sigma_{i1} + \frac{n_1}{n_2}\Sigma_{i2}$, where
$\Sigma_{i1}$ and $\Sigma_{i2}$ are the covariance matrices of $X^i$ and $Y^i$, respectively. Define

$$\Gamma_{ij} = \Sigma_i^{-1/2}\Big(\mathrm{Cov}(X^i, X^j) + \frac{n_1}{n_2}\mathrm{Cov}(Y^i, Y^j)\Big)\Sigma_j^{-1/2}.$$

The matrix $\Gamma_{ij}$ characterizes the dependence structure between $\{X^i, Y^i\}$
and $\{X^j, Y^j\}$. For example, when $n_1 = n_2$ and $\Sigma_{i1} = \Sigma_{i2}$,

$$\Gamma_{ij} = \tfrac{1}{2}\mathrm{Cov}(\Sigma_{i1}^{-1/2}X^i, \Sigma_{j1}^{-1/2}X^j) + \tfrac{1}{2}\mathrm{Cov}(\Sigma_{i2}^{-1/2}Y^i, \Sigma_{j2}^{-1/2}Y^j)$$

is the sum of two matrices. When $d = 1$ and $\Sigma_{i1} = \Sigma_{i2}$, then $\Gamma_{ij} = \rho_{ij1}$,
where $\rho_{ij1}$ is the correlation coefficient between $X^i$ and $X^j$. For $0 < r < 1$,
let

$$A(r) = \{1 \le i \le m: \|\Gamma_{ij}\| \ge r \text{ for some } j \ne i\},$$

where $\|\cdot\|$ is the spectral norm. $A(r)$ is a subset of $\{1, 2, \ldots, m\}$ in which
$\{X^i, Y^i\}$ can be highly correlated with other random vectors. Let $R_1 =
(r_{ij1})$ and $R_2 = (r_{ij2})$ be the correlation matrices of the random vectors
$((X^1)', \ldots, (X^m)')$ and $((Y^1)', \ldots, (Y^m)')$, respectively. For some $\gamma > 0$, let

Sj{m) = Cardjl <i<m: \riji\ > (log?7i) '^ or |rjj2| > (logm) '^}. 

We need the following condition on the dependence structure. 

(C1) Suppose that $\mathrm{Card}(A(r)) = o(m)$ for some $0 < r < 1$ and

max Sj(m) = O(m^) 
i<j<P ■' 

for all p > 0. Assume that mini<j<p{Aniin(Si)} > r for some r > 0, where 
Amin(S.j) is the smallest eigenvalue of 5]j. 

The dependence condition (C1) is mild. In (C1), $o(m)$ vectors $\{X^i, Y^i\}$,
$i \in A(r)$, can be highly correlated with other random vectors. Every $\{X^i, Y^i\}$
can be highly correlated with $s_i(m)$ vectors and weakly correlated with the
remaining vectors. The dependence in (C1) is more general than "clumpy
dependence" [Storey and Tibshirani (2001)] and may be a more realistic
form of dependence in DNA microarrays. See also Hall and Wang (2010)
who noted that short-range dependence, and more specifically, $k$-dependence
structure, are often observed in DNA microarrays.

The next condition is on the moment of the underlying distributions and
the relation between the sample sizes and dimension $m$. We assume that $m$
is a function of $n = n_1 + n_2$ and $m \to \infty$ as $n \to \infty$.

(C2) Suppose that $\max_{1\le i\le m} E(\|X^i\|^{3+\delta} + \|Y^i\|^{3+\delta}) \le \kappa$ for some $\kappa > 0$
and $\delta > 0$, $n_1 \asymp n_2$ and $\log m = o(n^{1/3})$.

Theorem 3.1. Under $H_0$, $d_i = d$, (C1) and (C2), we have as $n \to \infty$,

(3.5) $P\Big(\max_{1\le i\le m} T_{ni}^2 - 2\log m + (2-d)\log\log m \le y\Big) \to \exp\Big(-\frac{1}{\Gamma(d/2)}e^{-y/2}\Big)$

for any $y \in R$.
It follows from Theorem 2.1 that

$$y_n(\alpha) = 2\log m + (d-2)\log\log m + q_\alpha + o(1),$$

which, together with Theorem 3.1, yields the following theorem.

Theorem 3.2. Under $H_0$, $d_i = d$, (C1) and (C2), we have as $n \to \infty$,

(3.6) $P(\Phi_\alpha^* = 1) \to \alpha$.
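The expansion of $y_n(\alpha)$ can be checked heuristically. The following sketch is not the authors' argument: it replaces $F_{n_1,n_2}$ by the chi-squared tail, which Theorem 2.1 justifies in the relevant range, and uses the standard tail asymptotic $P(\chi^2(d) \ge y) \sim (y/2)^{d/2-1}e^{-y/2}/\Gamma(d/2)$ as $y \to \infty$. Here $q$ denotes a generic constant to be determined.

```latex
\begin{align*}
m\,P(\chi^2(d) \ge y) &\approx \frac{m\,(y/2)^{d/2-1}e^{-y/2}}{\Gamma(d/2)},
  \qquad y = 2\log m + (d-2)\log\log m + q,\\
e^{-y/2} &= m^{-1}(\log m)^{-(d-2)/2}e^{-q/2},
  \qquad (y/2)^{d/2-1} \sim (\log m)^{(d-2)/2},\\
m\,P(\chi^2(d) \ge y) &\to \frac{e^{-q/2}}{\Gamma(d/2)}.
\end{align*}
```

Condition (3.2) requires $mF_{n_1,n_2}(y_n(\alpha)) = -\log(1-\alpha)$. Setting the limit above equal to $\log(1-\alpha)^{-1}$ gives $q = -2\log\Gamma(d/2) - 2\log\log(1-\alpha)^{-1}$, which is the constant $q_\alpha$ of Remark 3.1.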

Remark 3.2. When the $d_i$ are different, we have a result similar to Theorem 3.2.
Under $H_0$, (C1) and (C2), we have as $n \to \infty$,

(3.7) $P(\Phi_\alpha^* = 1) \to \alpha$

for any $0 < \alpha < 1$, where $\Phi_\alpha^*$ is the test defined in Case 2. The proof of (3.7)
is similar to that of Theorem 3.1 and hence will be omitted.

As mentioned earlier, the convergence rate of (3.5) is very slow. In test- 
ing diagonal covariance matrix problem, Liu, Lin and Shao (2008) proposed 
to use an intermediate approximation and proved that the rate of conver- 
gence can be of order of y^{logm)^/n. Here we give a similar intermediate 
approximation to the distribution of $\max_{1\le i\le m} T_{ni}^2$.

Let $\Theta_j$ be the set of indices such that $T_{nj}^2$ is independent of $\{T_{ni}^2, i \in \Theta_j\}$
and put $s_j(m) = m - \mathrm{Card}(\Theta_j)$.

(C1*) Suppose that $\mathrm{Card}(A(r)) = O(m^{\xi})$ for some $0 < r < 1$ and $0 < \xi < 1$.
Assume that $\max_{1\le j\le m} s_j(m) = O(m^{\rho})$ for some $0 < \rho < (1-r)/(1+r)$.

(C2*) Suppose that $\max_{1\le i\le m} E(\|X^i\|^{3+\delta} + \|Y^i\|^{3+\delta}) \le \kappa$ for some $\kappa > 0$
and $\delta > 0$, $c_1 \le n_1/n_2 \le c_2$ for some $c_1 > 0$ and $c_2 > 0$ and $\log m = o(n^{1/3})$.

(C3*) Suppose that $\Sigma_{i1} = \Sigma_{i2}$ for $1 \le i \le m$. We assume that $X^i$ and $Y^i$
can be written as the transforms of independent components:

$$X^i = \Sigma_{i1}^{1/2}Z_{1i} + \mu_{1i} \quad\text{and}\quad Y^i = \Sigma_{i2}^{1/2}Z_{2i} + \mu_{2i},$$

where $EZ_{1i} = 0$, $\mathrm{Cov}(Z_{1i}) = I$ and $EZ_{2i} = 0$, $\mathrm{Cov}(Z_{2i}) = I$ and the components
in $Z_{1i}$ and $Z_{2i}$ are independent.

(C1*) is a technical condition. It allows $T_{nj}^2$ to be dependent with $O(m^{\rho})$
others. By (C1*), we can use the Poisson approximation in Arratia, Goldstein
and Gordon (1989). (C3*) is also required for technical reasons. It can

be avoided if we assume that maxi<j<m Ee"" 1"^" 1"^ < k for some t > 0. 

Theorem 3.3. Under Hq, di = d, (C1*)-(C3*), we have for any e > 

sup P( max Tli <y) - exp(-mF„,_„2(y)) 

(3.8) 

< c(^p^^ + mP-i^-r)/i^+r)+^ + ^C-1 log m 
where $F_{n_1,n_2}(y)$ is defined in (3.1) and $C$ is a finite constant depending only
on $\xi$, $r$, $\rho$, $\delta$, $\kappa$, $\varepsilon$, $c_1$, $c_2$ and $d$.

If ?n, > cin for all 6 > 0, then the error rate in Theorem 3.3 is of order 
■^/(log m,)^/n. By Theorem 3.3, we can get the following result. 

Theorem 3.4. Under Hq, d-i = d, (C1*)-(C3*), we have for any e > 0, 



sup I P($; = l)-a\<C (.[^^^ + ^p-(l-r)/(l+r)+. ^ ^5-1 log^ 

o<Q<i \\ n 

where C is given in (3.8). 

3.2. Power result for $\Phi_\alpha^*$. Here we consider the power of the test $\Phi_\alpha^*$.

Theorem 3.5. Suppose that

$$\max_{1\le i\le m}\|\Sigma_i^{-1/2}(\mu_{1i} - \mu_{2i})\| \ge \sqrt{\frac{(2+\varepsilon)\log m}{n_1}}$$

for some $\varepsilon > 0$. Then under (C1) and (C2),

$$P(\Phi_\alpha^* = 1) \to 1 \quad\text{as } n \to \infty.$$

Theorem 3.5 shows that, in order to reject the null hypothesis correctly,
we only require $\max_{1\le i\le m}\|\Sigma_i^{-1/2}(\mu_{1i} - \mu_{2i})\| \ge \sqrt{(2+\varepsilon)\log m/n_1}$. The optimality
of this lower bound when $d = 1$ can be found in Cai, Liu and Xia (2012). We
believe this lower bound remains optimal for $d \ge 2$ under some regularity
conditions.

4. Numerical results. 

4.1. Simulation. In this section, we examine the numerical performance
of the proposed test $\Phi_\alpha^*$ with $d = 3$. We first compare $\Phi_\alpha^*$ with $\Phi_\alpha$ to see
the improvement of the intermediate approximation and then compare $\Phi_\alpha^*$
to the higher criticism (HC*) test procedure [Donoho and Jin (2004), Hall
and Jin (2010)], the test procedure proposed by Chen and Qin (2010) (C-Q)
and the univariate $t$-test procedure based on $\max_{1\le i\le dm} t_i^2$ (U-T), where $t_i$
is the two sample $t$-statistic based on the $i$th coordinates of the observations.
The higher criticism test statistic is defined as in Hall and Jin (2010):

$$\mathrm{HC}^* = \max_{j:\,1/q \le p_{(j)} \le 1/2} \frac{\sqrt{q}\,(j/q - p_{(j)})}{\sqrt{p_{(j)}(1 - p_{(j)})}},$$

where $q = 3m$, $p_j = P(|N(0,1)| \ge |t_j|)$ and $p_{(j)}$ is the $j$th $p$-value after sorting
in ascending order. There are also other versions of HC* statistics [Donoho
and Jin (2004)]. They perform similarly in our numerical studies. The critical
values $a_n$ with significance level 0.05 are taken to be the solutions to
$P(\mathrm{HC}^* \ge a_n) = 0.05$ under the assumption that $p_j$, $1 \le j \le 3m$, are i.i.d. uniform (0, 1)
distributed random variables.
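The HC* statistic described above can be computed directly from the sorted $p$-values. The sketch below is illustrative only; the function names are hypothetical, and returning negative infinity when no sorted $p$-value falls in $[1/q, 1/2]$ is a convention chosen here, not one stated in the paper.

```python
import math
from typing import Sequence


def two_sided_normal_pvalue(t: float) -> float:
    """p = P(|N(0,1)| >= |t|) = erfc(|t| / sqrt(2))."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def higher_criticism(pvalues: Sequence[float]) -> float:
    """HC* = max over j with 1/q <= p_(j) <= 1/2 of
    sqrt(q) (j/q - p_(j)) / sqrt(p_(j) (1 - p_(j)))."""
    q = len(pvalues)
    best = -math.inf  # convention used here when the index set is empty
    for j, p in enumerate(sorted(pvalues), start=1):
        if 1.0 / q <= p <= 0.5:
            value = math.sqrt(q) * (j / q - p) / math.sqrt(p * (1.0 - p))
            best = max(best, value)
    return best
```

In the setting of this section one would pass the $q = 3m$ values `two_sided_normal_pvalue(t_j)` computed from the coordinate-wise two sample $t$-statistics.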
Let

$$((X^1)', \ldots, (X^m)') = (Z_1^1, \ldots, Z_{3m}^1) \times \Sigma^{1/2},$$

$$((Y^1)', \ldots, (Y^m)') = (Z_1^2, \ldots, Z_{3m}^2) \times \Sigma^{1/2}$$

be $3m$-dimensional random vectors with covariance matrix $\Sigma$, where $\{Z_i^j\}$
are i.i.d. random variables. We consider four distributions of $Z_i^j$: $N(0,1)$,
$t(5)$, the exponential distribution with parameter 1 (Exp(1)), and the Gamma
distribution with shape and scale parameters (2,2) (Gamma(2,2)). The covariance
matrix $\Sigma$ is taken to be:

(1) $\Sigma_1 = (0.9^{|j-i|})$;

(2) $\Sigma_2 = (\sigma_{ij})$, where $\sigma_{ij} = \max\{1 - |j-i|/(0.1 * (3m)), 0\}$;

(3) $\Sigma_3 = (\sigma_{ij})$, where $\sigma_{ij} = \max\{1 - |j-i|/(0.8 * (3m)), 0\}$.

$\Sigma_1$ is an approximately bandable matrix. $\Sigma_2$ is a $0.3m$ sparse matrix
which has $0.3m$ nonzero entries in each row. In $\Sigma_3$, the number of nonzero
entries in each row is $2.4m$ and the dependence between the variables becomes
stronger than that in $\Sigma_2$.
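The three covariance structures are easy to generate. The following sketch uses plain nested lists; the function names are hypothetical, and the bandwidth ($0.1 \cdot 3m$ or $0.8 \cdot 3m$) is passed in by the caller so that it can be given exactly, avoiding floating-point rounding at the band edge.

```python
from typing import List


def ar_covariance(dim: int, rho: float = 0.9) -> List[List[float]]:
    """Matrix with entries rho^{|j - i|}; rho = 0.9 gives the first structure."""
    return [[rho ** abs(j - i) for j in range(dim)] for i in range(dim)]


def banded_covariance(dim: int, bandwidth: float) -> List[List[float]]:
    """Matrix with entries max(1 - |j - i| / bandwidth, 0)."""
    return [
        [max(1.0 - abs(j - i) / bandwidth, 0.0) for j in range(dim)]
        for i in range(dim)
    ]


def nonzeros_in_row(matrix: List[List[float]], row: int) -> int:
    """Number of strictly positive entries in one row."""
    return sum(1 for value in matrix[row] if value > 0.0)
```

For $m = 50$ the dimension is 150 and the bandwidths are 15 and 120. With bandwidth 15, the first row has 15 nonzero entries ($0.3m$), while an interior row has 29, since the band extends on both sides of the diagonal.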

The sample sizes $(n_1,n_2)$ are taken to be (6,12), (12,24), (24,48) and $m$
takes values 50, 100, 200, 400. We carry out 5000 simulations to obtain the
empirical sizes with nominal significance level 0.05. The results for $\Sigma = \Sigma_1$
are summarized in Table 1. The simulation results when $\Sigma$ takes the other
covariance matrices are stated in the supplementary material [Liu and Shao
(2013)] due to limited space. We can see that the empirical sizes of $\Phi_\alpha^*$
and Chen and Qin's test are close to 0.05. $\Phi_\alpha^*$ still performs well when the
dependence becomes stronger ($\Sigma = \Sigma_2$ and $\Sigma_3$). However, the empirical
sizes of $\Phi_\alpha$ suffer very serious distortions. This indicates that the intermediate
approximation in Section 3 substantially improves the accuracy of
controlling type I errors. The test procedure $\Phi_\alpha^*$ is robust to the tails of
distributions and the dependence. On the other hand, the empirical sizes
of HC* are much larger than 0.05. This shows that the HC* statistic may
not be robust to the dependence. We have also done additional simulations and
found that, when the variables are independent but not normally distributed,
the HC* statistic may suffer serious distortions from the nominal significance
level.

To evaluate the power, we consider both an approximately sparse model and a
dense model. Let $\mu_{1i} = 0$ for $1 \le i \le m$. Set $\mu = (\mu_1, \ldots, \mu_{3m}) = E((Y^1)', \ldots, (Y^m)')$
and $\sigma^2 = \mathrm{Var}(Z_1^1)$. Consider

Model 1 (approximately sparse case). Let $\mu_i = (-0.2)^{i-1} \times 2\sqrt{\sigma^2\log m/n_2}$
for $1 \le i \le 3m$.
Table 1
Comparison of empirical sizes with nominal significance level 0.05 ($\Sigma = \Sigma_1$)

                                    $N(0,1)$                       $t(5)$
$m$ \ $(n_1,n_2)$           (6,12)   (12,24)  (24,48)     (6,12)   (12,24)  (24,48)
50    $\Phi_\alpha^*$       0.0516   0.0466   0.0430      0.0412   0.0374   0.0404
      $\Phi_\alpha$         0.8965   0.4760   0.2285      0.8641   0.4312   0.2078
      HC*                   0.5986   0.4348   0.3514      0.6028   0.4438   0.3534
      C-Q                   0.0634   0.0644   0.0632      0.0646   0.0660   0.0644
100   $\Phi_\alpha^*$       0.0558   0.0483   0.0508      0.0423   0.0360   0.0442
      $\Phi_\alpha$         0.9694   0.5799   0.2711      0.9542   0.5315   0.2364
      HC*                   0.7584   0.5228   0.4260      0.7460   0.5334   0.4100
      C-Q                   0.0606   0.0620   0.0626      0.0642   0.0614   0.0592
200   $\Phi_\alpha^*$       0.0602   0.0584   0.0515      0.0464   0.0393   0.0420
      $\Phi_\alpha$         0.9958   0.7045   0.3238      0.9916   0.6380   0.2783
      HC*                   0.9072   0.6492   0.4920      0.8986   0.6438   0.4672
      C-Q                   0.0624   0.0584   0.0600      0.0566   0.0570   0.0574
400   $\Phi_\alpha^*$       0.0636   0.0609   0.0495      0.0464   0.0402   0.0406
      $\Phi_\alpha$         1.0000   0.8198   0.3781      0.9996   0.7571   0.3253
      HC*                   0.9840   0.7876   0.5660      0.9814   0.7820   0.5642
      C-Q                   0.0552   0.0592   0.0604      0.0508   0.0580   0.0588

                                    Exp(1)                         Gamma(2,2)
$m$ \ $(n_1,n_2)$           (6,12)   (12,24)  (24,48)     (6,12)   (12,24)  (24,48)
50    $\Phi_\alpha^*$       0.0355   0.0392   0.0450      0.0403   0.0468   0.0451
      $\Phi_\alpha$         0.8441   0.4294   0.2226      0.8675   0.4473   0.2291
      HC*                   0.5950   0.4492   0.3584      0.5924   0.4370   0.3604
      C-Q                   0.0628   0.0622   0.0688      0.0580   0.0728   0.0666
100   $\Phi_\alpha^*$       0.0404   0.0372   0.0519      0.0436   0.0414   0.0524
      $\Phi_\alpha$         0.9409   0.5230   0.2625      0.9557   0.5521   0.2725
      HC*                   0.7502   0.5296   0.4188      0.7640   0.5352   0.4212
      C-Q                   0.0620   0.0626   0.0644      0.0664   0.0582   0.0598
200   $\Phi_\alpha^*$       0.0408   0.0364   0.0498      0.0481   0.0435   0.0551
      $\Phi_\alpha$         0.9882   0.6355   0.3105      0.9923   0.6671   0.3196
      HC*                   0.8910   0.6358   0.4806      0.9042   0.6538   0.5014
      C-Q                   0.0602   0.0608   0.0630      0.0570   0.0556   0.0610
400   $\Phi_\alpha^*$       0.0460   0.0355   0.0517      0.0478   0.0449   0.0529
      $\Phi_\alpha$         0.9987   0.7430   0.3671      0.9997   0.7810   0.3693
      HC*                   0.9766   0.7788   0.5768      0.9838   0.7916   0.5762
      C-Q                   0.0570   0.0590   0.0568      0.0518   0.0544   0.0572

Model 2 (dense case). Let $\mu_i = 0.2(-1)^{i-1} \times 2\sqrt{\sigma^2\log m/n_2}$ for
$1 \le i \le 3m$.

Because of the serious distortion of empirical sizes of $\Phi_\alpha$ and HC*, we do
not consider the power of $\Phi_\alpha$ and HC*. We only report the power results for
the normal distributions due to the high similarity of the results with other
distributions. The rejection region for $\max_{1\le i\le dm} t_i^2$ is $[y_n(\alpha), \infty)$ with $d = 1$ in
$F_{n_1,n_2}(y)$ and $y_n(\alpha)$ satisfying

$$\exp(-3mF_{n_1,n_2}(y_n(\alpha))) = 1 - \alpha.$$

This gives a much more accurate approximation than the extreme distribution
(results will not be reported here).

Table 2
Comparison of empirical powers ($\Sigma = \Sigma_1$)

                                    Model 1                        Model 2
$m$ \ $(n_1,n_2)$           (6,12)   (12,24)  (24,48)     (6,12)   (12,24)  (24,48)
50    $\Phi_\alpha^*$       0.7343   0.9327   0.9758      0.9453   0.9959   0.9994
      C-Q                   0.0755   0.0739   0.0755      0.1369   0.1343   0.1404
      U-T                   0.0766   0.0938   0.1064      0.0901   0.0890   0.0862
100   $\Phi_\alpha^*$       0.7489   0.9538   0.9880      0.9943   1.0000   1.0000
      C-Q                   0.0704   0.0733   0.0720      0.2201   0.2250   0.2295
      U-T                   0.0713   0.1001   0.0921      0.1019   0.1137   0.0875
200   $\Phi_\alpha^*$       0.7451   0.9635   0.9937      0.9998   1.0000   1.0000
      C-Q                   0.0761   0.0665   0.0705      0.4289   0.4365   0.4303
      U-T                   0.0719   0.1058   0.0945      0.1278   0.1507   0.1160
400   $\Phi_\alpha^*$       0.7520   0.9696   0.9957      1.000    1.0000   1.0000
      C-Q                   0.0633   0.0634   0.0636      0.7701   0.7997   0.8007
      U-T                   0.0703   0.1089   0.0951      0.1414   0.2062   0.1467

In Table 2, we only state the results when $\Sigma = \Sigma_1$. The other simulation
results are given in the supplementary material [Liu and Shao (2013)]. Note
that in Model 1, $n\|\mu\|^2/m^{1/2} \to 0$. The power of the test of Chen and Qin (2010) is
low, as shown in Table 2. The power of $\max_{1\le i\le dm} t_i^2$ is also quite low.
Our test $\Phi_\alpha^*$ has the highest power, which is close to one for
$(n_1,n_2) = (12,24)$ and $(24,48)$. In the dense case, Model 2, our test
still has the highest power. We should remark that no method can uniformly
outperform others over all models and there may exist certain situations
where Chen and Qin's (2010) test statistic may outperform ours.

4.2. Real data analysis. We apply the test procedure in Section 3 to test
whether the tamoxifen therapy is effective on the promoter DNA methylation
status of 117 genes. The dataset consists of 123 patients, who showed
the extreme types of response to tamoxifen treatment; they either had an
objective response (CR+PR, 45 patients) or a progressive disease right from
the start of treatment (PD, 78 patients). There are 117 genes and each gene
corresponds to a 2-6-dimensional vector that represents DNA methylation
status of CpG sites analyzed using a microarray-based DNA methylation
detection assay. Martens et al. (2005) used the Benjamini-Hochberg (B-H)
FDR procedure with the target FDR of 25% to identify genes whose promoter
DNA methylation status was associated with the clinical benefit of
tamoxifen therapy. Before using the B-H FDR procedure, it is interesting to test
whether the tamoxifen therapy is effective on the promoter DNA methylation
status of those genes.

For each gene, we calculate the Hotelling's $T^2$-statistic $T_{ni}^2$. The given
significance level is $\alpha = 0.05$. The value of $\max_{1\le i\le m} G_{n_1,n_2,d_i}(T_{ni}^2)$ is 1.0000,
which is larger than $1 + m^{-1}\log(0.95) = 0.9996$. Thus, we conclude at
the 0.05 significance level that the tamoxifen therapy has an effect on the
promoter DNA methylation status. We found three genes, PSAT1, STMN1
and SFN, whose values of $G_{n_1,n_2,d_i}(T_{ni}^2)$ are larger than 0.9996. These three
genes were also identified by Martens et al. (2005) who used the B-H FDR
correction and the $\chi^2$ distributions.
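As a quick check of the threshold quoted above, take $m = 117$ genes and $\alpha = 0.05$ from the text; the intermediate decimals below are rounded.

```latex
g_m(\alpha) = 1 + \frac{\log(0.95)}{117}
            \approx 1 - \frac{0.05129}{117}
            \approx 1 - 0.00044
            = 0.99956 \approx 0.9996 .
```

Equivalently, a gene exceeds the threshold when its normal-theory tail probability $1 - G_{n_1,n_2,d_i}(T_{ni}^2)$ is below about $4.4 \times 10^{-4}$, which is close to the Bonferroni level $0.05/117 \approx 4.3 \times 10^{-4}$.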

5. Proof of main results. 



5.1. Proof of Theorem 2.1. Without loss of generality, we assume that
$\mu_1 = \mu_2 = 0$. Since $T_n^2$ converges to a chi-squared distribution with $d$ degrees
of freedom, we have for any $M > 0$

$$\lim_{n\to\infty}\sup_{0\le x\le M}\left|\frac{P(T_n^2 \ge x^2)}{P(\chi^2(d) \ge x^2)} - 1\right| = 0.$$

Thus, there exists a sequence $a_n \to \infty$ such that

(5.1) $\lim_{n\to\infty}\sup_{0\le x\le a_n}\left|\frac{P(T_n^2 \ge x^2)}{P(\chi^2(d) \ge x^2)} - 1\right| = 0.$

Let S = Si + ^Sa and 



Zfc - 
By the identity 




fc— ni 



x' A x = max 



1 < A; < ni, 

rii + 1 < k < ni -\- n2. 



=1 e'Ae 

for any dx d positive definite matrix A, where ^ is a d-dimensional vector, 
we have 

n 

{Tl>x^] = l3e,,.i. 



>x. 



Y,{0"^kf -ni{6'Z^f -n2{e'Z2f 



\ k=l 

where n = ni + n2, Zi = ;^ Y2=i Zfe and ^2 = ^ ELm+i Zfc- Theorem 2.1 
follows if we can prove that 



(5.2) 



P(3g, s.t. ||g|| = 1, 1 Efcgg^^Zfcl > x^T.keH{(^"Z'k) 
P(x2((i)>x2) 
uniformly for $x \in [a_n, o(n^{1/6}))$, $H = \{1, 2, \ldots, n\}$, $\{1, 2, \ldots, n_1\}$ and
$\{n_1 + 1, \ldots, n\}$. In fact, (5.2) implies that, for $i = 1, 2$,

P(3g, s.t. Il^ll = 1, \e%\ > 2nT^x^Y.l=i{Q"^k)^) 

P(x2(d)>4x2) ^ 

uniformly for x G [a„,o(n-^'^)). Observe that 



"1 



< P 36*, s.t. 116*11 = 1, \e'Zi\ > 2n{^x 

+ P ( 36*, s.t. ||6i|| = 1, \d'%\ > 2n2^x 
+ P 3^, s.t. ||6 



\ k=\ 






1+1 



1 



En nirj I 



'(EL.(^'z.)^)^-^^'-'^'""-'^'^^"^''0 



(2 + o(l))P(x=^(d)>4x2) 
+ P(30, s.t. 11^ 



IV^". nirj I 



o(l)P(x'(rf)>x2) 
+ P(3^, s.t. 11^11 = 



1, 



(ELi(^'Zfc)2)i/2 



En nIrj I 



.2„-l_/1^2„-Ul/2 



(EL,(«'z.-P)"'^ -'*'"'' "■"'"'^ 



uniformly in $x \in [a_n, o(n^{1/6}))$. Similarly, we can obtain a lower bound for
$P(T_n^2 \ge x^2)$, which together with (5.1) and (5.2) yields (2.1).

We only prove (5.2) with $H = \{1, 2, \ldots, n\}$. The proof for the other two
cases is similar. Let $3/(3+\delta) < \beta < 1$, $\hat Z_k = Z_k I\{\|Z_k\| \le (\sqrt{n}/x)^{\beta}\}$ and set

n n 

fe=l fc=l,fc^N 

n n 

fc=l,fc^N 



fc=l 



Vn(^)=E(^'z,)^ viN}(0)= E ( 



fc=l 
n 



^'Zfc)^ 



3/-^ \2 



v„(0) = E(^'Z'c)', viN^(e)= E (^'Zfc)'' 

fc=l fc=l,A,'^N 
where $N$ is an index set. By the fact that [see (5.7) in Jing, Shao and Wang
(2003)]



(5.3) {s + t> x\lc + t2} c {s > (x^ - l)^/^^/c} 

for any s,t £ R, c > and x > 1, we have 



P{39, S.t. Il^ll = 1, \Sn{9)\ > Xy^V;^) 



< P(30, s.t. Il^ll = 1, \Sn{e)\ > xa/ V„(^)) 



(5.4) +Y.P{39, s.t. Il^ll = 1,15^(0)1 > V^c^^^/Y\'\e),Aj) 



p(30, s.t. \\9\\ = i,\Snm>x\^^nm 



+ J]P(30, s.t. Il^ll = l,\Si^He)\ > V^^^yJvi'\9))P{A,), 



where 



A, 



\Zj\\>{y/n/x)'^} forl<i<n. 



Repeating (5.4) and inequality (5.3) $m$ times, we get



9(39, s.t. Il^ll = 1, |5„(^)| > xVV„(0)) 



< 9(39, s.t. Il^ll = 1, |5„(^)| > x^Yn{9)) + J^ C>z + C/^+i, 



1=1 



where 



n n 






np(^^- 



.fe=i 



X P(3^, s.t. Il^l 



1, \si^''-^'H9)\ > Vx^-i\/^rl^''-'''\e)) 



and 



n n -m+l 

jl = l jm + l = l fc = l 



Let $m = [x^2/2]$ for $x \ge 4$. We have



m+l 



C/^+i=^P(||Zfc||>(V^/a 



(5.5) 



^fe=i 



<e-"'^°^i" = o{l)Pix'^id)>x), 
where

The proof of (5.2) now relies on the Cramér-type moderate deviation theorem for
self-normalized truncated variables given below.

Proposition 5.1. Assume that Card(N) = 0(x^). Then we have 



p{3e, s.t. \\e\\ = 1, \si''H9)\ > x^Jiri''\e)) 

(5.6) 

= {l + oil))P{xHd)>x^) 

uniformly in x ^ [a.„,o(?i"'^'^)). 
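The event "there exists a unit vector θ with |S_n(θ)| ≥ x·sqrt(V_n(θ))" is the same as a quadratic form exceeding x², because the supremum over unit θ of (θ'S)² / Σ_k (θ'Z_k)² equals S'M⁻¹S with S = Σ_k Z_k and M = Σ_k Z_k Z_k'. The sketch below illustrates this for d = 2 with untruncated Gaussian data; the function names, the sample size and the Gaussian choice are assumptions made for illustration only.

```python
import math
import random

def self_normalized_sup(points):
    """Closed form S' M^{-1} S for d = 2, with S = sum_k Z_k and M = sum_k Z_k Z_k'."""
    s1 = sum(z[0] for z in points)
    s2 = sum(z[1] for z in points)
    m11 = sum(z[0] * z[0] for z in points)
    m12 = sum(z[0] * z[1] for z in points)
    m22 = sum(z[1] * z[1] for z in points)
    det = m11 * m22 - m12 * m12
    return (m22 * s1 * s1 - 2 * m12 * s1 * s2 + m11 * s2 * s2) / det

def grid_sup(points, steps=2000):
    """Brute-force maximum of (theta'S)^2 / sum_k (theta'Z_k)^2 over a grid of unit vectors."""
    best = 0.0
    for i in range(steps):
        a = math.pi * i / steps  # theta and -theta give the same ratio
        th = (math.cos(a), math.sin(a))
        proj = [th[0] * z[0] + th[1] * z[1] for z in points]
        s = sum(proj)
        v = sum(p * p for p in proj)
        best = max(best, s * s / v)
    return best

rng = random.Random(1)
sample = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
closed = self_normalized_sup(sample)
approx = grid_sup(sample)
print(closed, approx)
```

The grid value can only approach the closed form from below, so the two printed numbers should agree to several digits.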

The proof of Proposition 5.1 will be given in the next subsection. Let us 
now finish the proof of (5.2). 

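Using the same arguments as in the proof of inequality (5.5) and by
Proposition 5.1, we have

\[
\sum_{l=1}^{m} C_l \le C \sum_{l=1}^{m} P(\chi^2(d) \ge x^2 - l) \exp(-l \log g_n)
= o(1) P(\chi^2(d) \ge x^2)
\]

uniformly in $x \in [a_n, o(n^{1/6}))$. Hence,

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |S_n(\theta)| \ge x\sqrt{V_n(\theta)}\big) \le (1 + o(1)) P(\chi^2(d) \ge x^2)
\]

uniformly in $x \in [a_n, o(n^{1/6}))$. To establish the lower bound, we note that

\[
\begin{aligned}
& P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |S_n(\theta)| \ge x\sqrt{V_n(\theta)}\big) \\
&\quad \ge P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n(\theta)| \ge x\sqrt{\hat{V}_n(\theta)}\big) \\
&\qquad - \sum_{j=1}^{n} P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{j\}}(\theta)| \ge \sqrt{x^2-1}\,\sqrt{\hat{V}_n^{\{j\}}(\theta)}\big) P(A_j).
\end{aligned}
\]

It follows from Proposition 5.1 again that

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |S_n(\theta)| \ge x\sqrt{V_n(\theta)}\big) \ge (1 + o(1)) P(\chi^2(d) \ge x^2)
\]

uniformly in $x \in [a_n, o(n^{1/6}))$. This completes the proof of (5.2) and hence
Theorem 2.1.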

5.2. Proof of Proposition 5.1. We start with the Cramér type moderate
deviation theorem for non-self-normalized sum.

Lemma 5.1. Let $\mathrm{Card}(N) = O(x^2)$. We have

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1}\big) = (1 + o(1)) P(\chi^2(d) \ge x^2)
\]

uniformly in $x \in [4, o(n^{1/6}))$.
To prove Lemma 5.1, we need the following lemma by Lin and Liu (2009).
The definition of $|\cdot|_d$ below is slightly different from that in Lin and Liu
(2009), but the proof is exactly the same.

Lemma 5.2. Let $\xi_{n,1}, \ldots, \xi_{n,k_n}$ be independent random vectors with mean
zero and values in $\mathbb{R}^d$, and $S_n = \sum_{i=1}^{k_n} \xi_{n,i}$. Assume that $\|\xi_{n,i}\| \le c_n B_n^{1/2}$,
$1 \le i \le k_n$, for some $c_n \to 0$, $B_n \to \infty$ and

\[
\|B_n^{-1} \mathrm{Cov}(\xi_{n,1} + \cdots + \xi_{n,k_n}) - I_d\| \le C_0 c_n^2,
\]

where $I_d$ is a $d \times d$ identity matrix and $C_0$ is a positive constant. Suppose
that $\beta_n := B_n^{-3/2} \sum_{i=1}^{k_n} E\|\xi_{n,i}\|^3 \to 0$. Then for all $n \ge n_0$ ($n_0$ is given below)

\[
|P(|S_n|_d \ge x) - P(|N|_d \ge x/B_n^{1/2})| \le o(1) P(|N|_d \ge x/B_n^{1/2})
\]

uniformly for $x \in [B_n^{1/2}, \delta_n \min(c_n^{-1}, \beta_n^{-1/3}) B_n^{1/2}]$, with any $\delta_n \to 0$ and
$\delta_n \min(c_n^{-1}, \beta_n^{-1/3}) \to \infty$, where $N$ is a centered normal random vector with
covariance matrix $I_d$; $|\cdot|_d$ denotes $|z|_d = \min\{\|x_i\| : 1 \le i \le d/q\}$, $z = (x_1, \ldots, x_{d/q})$,
$x_i \in \mathbb{R}^q$ and $d/q$ is an integer; $o(1)$ is bounded by $A_n := A(\delta_n + \beta_n)$,
$A$ is a positive constant depending only on $d$;

\[
n_0 = \min\{n : \forall k \ge n,\ c_k^2 \le C_{01},\ \delta_k \le C_{02},\ \beta_k \le C_{03}\},
\]

where $C_{01}$, $C_{02}$ and $C_{03}$ are some positive constants depending only on $d$
and $C_0$.

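Proof of Lemma 5.1. Let $\xi_{nk} = \hat{Z}_k - E\hat{Z}_k$, $B_n = n_1$ and
$c_n = 2 n_1^{-1/2} (\sqrt{n}/x)^{\beta}$ in Lemma 5.2. By the inequalities $\beta > 3/(3+\delta)$ and
$x = O(n^{1/6})$,

\[
\Big\| n_1^{-1} \mathrm{Cov}\Big(\sum_{k=1}^{n} \xi_{nk}\Big) - I_d \Big\|
\le C \max_{1 \le k \le n} E\|Z_k\|^2 I\{\|Z_k\| > (\sqrt{n}/x)^{\beta}\}
\le C (x/\sqrt{n})^{\beta(1+\delta)} \le C c_n^2.
\]

By letting $\delta_n \to 0$ sufficiently slowly, we have

uniformly in $x \in [4, o(n^{1/6}))$. This proves Lemma 5.1. □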
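The last inequality in the covariance bound is where the condition $\beta > 3/(3+\delta)$ enters. A short worked check, under the assumptions (not restated in this part of the text) that $n_1 \le n$, that $\sup_k E\|Z_k\|^{3+\delta} < \infty$, and that the choice of $c_n$ reads $c_n = 2 n_1^{-1/2}(\sqrt{n}/x)^{\beta}$:

```latex
% Markov-type truncation bound with threshold t = (\sqrt{n}/x)^{\beta}:
E\|Z_k\|^2 I\{\|Z_k\| > t\} \le t^{-(1+\delta)} E\|Z_k\|^{3+\delta}
  \le C (x/\sqrt{n})^{\beta(1+\delta)} .
% Comparison with c_n^2 = 4 n_1^{-1} (n/x^2)^{\beta}:
\frac{(x/\sqrt{n})^{\beta(1+\delta)}}{c_n^2}
  = \frac{n_1}{4} \Big(\frac{x^2}{n}\Big)^{\beta(3+\delta)/2}
  \le C\, n \cdot n^{-\beta(3+\delta)/3}
  \quad \text{when } x^2 \le C n^{1/3},
% which stays bounded precisely because \beta(3+\delta) > 3.
```

The same computation shows that $\|\xi_{nk}\| \le 2(\sqrt{n}/x)^{\beta} = c_n B_n^{1/2}$, which is the boundedness hypothesis of Lemma 5.2.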
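Proof of Proposition 5.1. Observe that

\[
\begin{aligned}
& P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)}\big) \\
&\quad \le P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1(1 - \varepsilon_n x^{-2})}\big) \\
&\qquad + P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ E_n(\theta)\big)
\end{aligned}
\]

and

\[
\begin{aligned}
& P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)}\big) \\
&\quad \ge P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1(1 + \varepsilon_n x^{-2})}\big) \\
&\qquad - P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1(1 + \varepsilon_n x^{-2})},\ F_n(\theta)\big),
\end{aligned}
\]

where $\varepsilon_n \to 0$ will be specified later and

\[
E_n(\theta) = \{\hat{V}_n^{\{N\}}(\theta) \le n_1(1 - \varepsilon_n x^{-2})\}, \qquad
F_n(\theta) = \{\hat{V}_n^{\{N\}}(\theta) \ge n_1(1 + \varepsilon_n x^{-2})\}.
\]

Also note that

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1}\big) = P\big(|\hat{S}_n^{\{N\}}|_d \ge x\sqrt{n_1}\big)
\]

with $q = d$. By Lemma 5.1, we have

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1(1 \pm \varepsilon_n x^{-2})}\big) = (1 + o(1)) P(\chi^2(d) \ge x^2)
\]

uniformly in $x \in [a_n, o(n^{1/6}))$. So it suffices to prove the following lemma. □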

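Lemma 5.3. Let $\mathrm{Card}(N) = O(x^2)$. We have

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ E_n(\theta)\big)
= o(1) P(\chi^2(d) \ge x^2)
\tag{5.7}
\]

and

\[
P\big(\exists\theta, \text{ s.t. } \|\theta\| = 1,\ |\hat{S}_n^{\{N\}}(\theta)| \ge x\sqrt{n_1(1 + \varepsilon_n x^{-2})},\ F_n(\theta)\big)
= o(1) P(\chi^2(d) \ge x^2)
\tag{5.8}
\]

uniformly in $x \in [a_n, o(n^{1/6}))$.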
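Proof. We only prove (5.7) because the proof of (5.8) is similar. Let
$b = x/\sqrt{n_1}$. Then for $0 < \varepsilon_n < 1/2$,

\[
\begin{aligned}
& \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ E_n(\theta)\} \\
&\quad \subset \{2b\hat{S}_n^{\{N\}}(\theta) - b^2\hat{V}_n^{\{N\}}(\theta) \ge x^2 - \varepsilon_n^2,\ E_n(\theta)\} \\
&\qquad \cup \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ 2xb\sqrt{\hat{V}_n^{\{N\}}(\theta)} \le b^2\hat{V}_n^{\{N\}}(\theta) + x^2 - \varepsilon_n^2,\ E_n(\theta)\}.
\end{aligned}
\]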

We can choose rid points 9j, 1 < J < ?^d, with ||^j|| = 1 and rid ^ n , such 
that for any \\9\\ = 1, \\6 — 6j\\ < Cn~'^ for some 1 < j < rid. So we have 

p( U {2bSi''H0)-b'VJ^He)>x'-elE4e)} 
11^11=1 






' n-,\ 



nd 

< J2 Pi'^bsi'^He,) - b\vi^H9,) - Evr>(^,)) 

+ t(EvW(0^.)_viN}(0.)) 

> 2x2 -el- n^'^ - 0{nb^) + tniEnX'"^ - 0{ntb)) 

Let t = {x/y/nf-'^ with < 7 < /3(1 + 5) - 1 and max{(xV"')'^''^,«n^^^} < 
£n — > 0. We use Corollary 5 of Sakhanenko (1991) to bound Ij. Let 

ik = 2berZk - 2&E0^.Zfc - (62 _ t){{e]tkf - E(^;.Zfc)2), A; ^ N. 

Then \ik\ = 0(1), Bl = T^k^N ^H = ^x'^ + 0{l)nb^, and for any bounded /i, 

L{h) = ^E|efc|3max{e^«^l} = 0(l)n6^ 

where 0(1) are bounded by some absolute constants. Let 

yn{x) = 2x2 - £n- n{^ - 0{nb^) + tniEnX'"^ - 0{ntb). 

By Corollary 5 of Sakhanenko (1991) and direct calculations, we obtain that 

Ij = (1 - a>(y„(x)/i3„))(l + O(xVV^)) 

= 0(l)x-i exp(-x2/2 - {n/xy'^) 
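uniformly in $x \in [a_n, o(n^{1/6}))$. Hence, it follows that

\[
P\Big(\bigcup_{\|\theta\|=1} \{2b\hat{S}_n^{\{N\}}(\theta) - b^2\hat{V}_n^{\{N\}}(\theta) \ge x^2 - \varepsilon_n^2,\ E_n(\theta)\}\Big)
= o(1) P(\chi^2(d) \ge x^2)
\tag{5.9}
\]

uniformly in $x \in [a_n, o(n^{1/6}))$.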
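Observe that

\[
\begin{aligned}
& \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ 2xb\sqrt{\hat{V}_n^{\{N\}}(\theta)} \le b^2\hat{V}_n^{\{N\}}(\theta) + x^2 - \varepsilon_n^2,\ E_n(\theta)\} \\
&\quad \subset \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ b^2\hat{V}_n^{\{N\}}(\theta) \ge x^2 + \varepsilon_n x,\ E_n(\theta)\} \\
&\qquad \cup \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ b^2\hat{V}_n^{\{N\}}(\theta) \le x^2 - \varepsilon_n x,\ E_n(\theta)\}.
\end{aligned}
\tag{5.10}
\]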
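The split in (5.10) rests on an elementary fact: writing w = b·sqrt(V), the condition 2xw ≤ w² + x² − ε² is the same as |w − x| ≥ ε, which forces either w² ≥ x² + εx or w² ≤ x² − εx whenever 0 < ε ≤ x. A minimal numerical sketch of this step follows; the function name `check_split` and the sampling ranges are illustrative assumptions, and the events involving S and E_n(θ) are left out because they appear unchanged on both sides.

```python
import random

def check_split(trials=100_000, seed=0):
    """Check that 2*x*w <= w^2 + x^2 - eps^2 implies
    w^2 >= x^2 + eps*x or w^2 <= x^2 - eps*x, for 0 < eps < 1/2 <= x."""
    rng = random.Random(seed)
    cases = 0       # draws satisfying the premise
    violations = 0  # draws satisfying the premise but neither conclusion
    for _ in range(trials):
        x = rng.uniform(1, 10)
        eps = rng.uniform(0.001, 0.5)
        w = rng.uniform(0, 2 * x)  # w plays the role of b*sqrt(V)
        if 2 * x * w <= w * w + x * x - eps * eps:
            cases += 1
            upper = w * w >= x * x + eps * x - 1e-9  # tolerance for rounding
            lower = w * w <= x * x - eps * x + 1e-9
            if not (upper or lower):
                violations += 1
    return cases, violations

cases, violations = check_split()
print(cases, violations)
```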

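By Lemma 5.1,

\[
\begin{aligned}
& P\Big(\bigcup_{\|\theta\|=1} \{\hat{S}_n^{\{N\}}(\theta) \ge x\sqrt{\hat{V}_n^{\{N\}}(\theta)},\ b^2\hat{V}_n^{\{N\}}(\theta) \ge x^2 + \varepsilon_n x,\ E_n(\theta)\}\Big) \\
&\quad \le P\Big(\bigcup_{\|\theta\|=1} \{\hat{S}_n^{\{N\}}(\theta) \ge \sqrt{(x^2 + \varepsilon_n x) n_1}\}\Big) \\
&\quad = (1 + o(1)) P(\chi^2(d) \ge x^2 + \varepsilon_n x) \\
&\quad = o(1) P(\chi^2(d) \ge x^2)
\end{aligned}
\]

uniformly in $[a_n, o(n^{1/6}))$ for any $a_n \to \infty$. For the second term on the right-hand
side of (5.10),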



P U {Si''He)>x^Jvi''\9),b'\i''H9)<x'-enX,E.^{9)} 



<j:^([J{si''Ho)>x^vi''\e), 

k=i ^\m=i 
(5.11) 

viN}(0) e [ni(l - Enik + l)/x), ni(l - Snk/x)]} 

+ P( U {Vi''\0)<m{l-ej2)}\. 
\\e\\=i ' 

For the last term above, we use the Bernstein inequality and obtain 
Pf U {Vi^^(^)<m(l-W2)} 
<Y.n^i''H0,)< mil- en/2) + n-') 

rid 

< Y, P(EViN>(^i) - V^^>(^,) > ni(.„,/2 + 0{x/V^))) 



< 



ni(e„,/2 + 0(x/V^))2 



exp 



" 26-2/3 + 46-2/3(^^/2 + 0(x/VH))/3, 
= o(l)P(x'((i)>x2) 

uniformly in [a„, o(n^' ^)). For the first term in (5.11), as in the proof of (5.9) 
using Corollary 5 of Sakhanenko (1991), we can show that 



P [J {si''Ho)>xyJvi''He), 



l|eil=i 



Vi^He) G [ni(l - Snik + l)/a;),ni(l - Enk/x)]} 



<P( U {Si''HO)>x^m{l-en{k + l)/x), 

'^i^H0)<m{l-enk/x)} 

<P( U {bSi''H9) + t{EVi^H9)-Yi''H9)) 

l|s||=i 

> xJni{l — £n{k + l)/x) + nitEnk/x + 0{ntb)} J 

< CndX~ exp(— X /2 — CQX~'^'n?' Sn) 
= o{l)P{x\d)>x^) 

uniformly in $[a_n, o(n^{1/6}))$. This completes the proof of Lemma 5.3. □

5.3. Proof of Theorem 3.1. Let $x_n = (2\log m + (d-2)\log\log m + x)^{1/2}$.
Note that by Theorem 2.1,

\[
P\Big(\max_{i \in \Lambda(r)} T_i^2 \ge x_n^2\Big) \le C\,\mathrm{Card}(\Lambda(r))\, m^{-1} = o(1).
\]

It suffices to prove that

Since $\mathrm{Card}(\Lambda(r)) = o(m)$, without loss of generality, we can assume that
$\Lambda(r) = \emptyset$, that is, $\max_{1 \le i < j \le m} \|r_{ij}\| \le r$ for some $r < 1$. Otherwise, we only
need to replace $\max_{1 \le i \le m}(\cdot)$ below by $\max_{1 \le i \le m,\, i \notin \Lambda(r)}(\cdot)$ and the proof
remains the same. As in the proof of Theorem 2.1, we set

. (S'^^'^XI, l<k<ni, 

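and use the same truncation notations as in the proof of Theorem 2.1. With
a careful check of the proofs of Theorem 2.1 and Proposition 5.1, we can see
that it suffices to show that, for $\mathrm{Card}(N) = O(x^2)$,

\[
P\Big(\max_{1 \le i \le m} \|\hat{S}_{ni}^{\{N\}}\| \ge x_n\sqrt{n_1(1 \pm \varepsilon_n x_n^{-2})}\Big)
\to 1 - \exp\Big(-\frac{1}{\Gamma(d/2)} \exp(-x/2)\Big).
\tag{5.12}
\]

Let $y_n = x_n\sqrt{n_1(1 \pm \varepsilon_n x_n^{-2})}$, where $\varepsilon_n \to 0$ is to be specified later. By the
Bonferroni inequality, we have for any fixed integer $k$,

\[
\begin{aligned}
& \sum_{l=1}^{2k} (-1)^{l-1} \sum_{1 \le i_1 < \cdots < i_l \le m} P\big(\|\hat{S}_{ni_1}^{\{N\}}\| \ge y_n, \ldots, \|\hat{S}_{ni_l}^{\{N\}}\| \ge y_n\big) \\
&\quad \le P\Big(\max_{1 \le i \le m} \|\hat{S}_{ni}^{\{N\}}\| \ge y_n\Big) \\
&\quad \le \sum_{l=1}^{2k-1} (-1)^{l-1} \sum_{1 \le i_1 < \cdots < i_l \le m} P\big(\|\hat{S}_{ni_1}^{\{N\}}\| \ge y_n, \ldots, \|\hat{S}_{ni_l}^{\{N\}}\| \ge y_n\big).
\end{aligned}
\]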

1=1 l<ii<-<ii<m 

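Theorem 3.1 follows from the following lemma.

Lemma 5.4. Let $\mathrm{Card}(N) = O(x^2)$. We have for any fixed $l$,

\[
\sum_{1 \le i_1 < \cdots < i_l \le m} P\big(\|\hat{S}_{ni_1}^{\{N\}}\| \ge y_n, \ldots, \|\hat{S}_{ni_l}^{\{N\}}\| \ge y_n\big)
= (1 + o(1)) \frac{1}{l!} \Big(\frac{1}{\Gamma(d/2)} \exp(-x/2)\Big)^{l}.
\]

In fact, by Lemma 5.4, we have

\[
\limsup_{n \to \infty} P\Big(\max_{1 \le i \le m} \|\hat{S}_{ni}^{\{N\}}\| \ge y_n\Big)
\le \sum_{l=1}^{2k-1} (-1)^{l-1} \frac{1}{l!} \Big(\frac{1}{\Gamma(d/2)} \exp(-x/2)\Big)^{l}
\to 1 - \exp\Big(-\frac{1}{\Gamma(d/2)} \exp(-x/2)\Big)
\]

as $k \to \infty$. Similarly,

\[
\liminf_{n \to \infty} P\Big(\max_{1 \le i \le m} \|\hat{S}_{ni}^{\{N\}}\| \ge y_n\Big)
\ge 1 - \exp\Big(-\frac{1}{\Gamma(d/2)} \exp(-x/2)\Big).
\]

This proves Theorem 3.1.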
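The centering in $x_n$ is chosen so that the expected number of exceedances, m·P(χ²(d) ≥ x_n²), tends to exp(−x/2)/Γ(d/2), and the Bonferroni partial sums then bracket 1 − exp(−a) for that limit a. The sketch below checks both facts numerically. It assumes an integer d and uses closed-form expressions for the chi-squared tail; the helper names `chi2_sf`, `expected_exceedances` and `bonferroni_partial` are illustrative and do not come from the paper.

```python
import math

def chi2_sf(y, d):
    """P(chi^2(d) >= y) for integer d >= 1, via closed forms of Q(d/2, y/2)."""
    z = y / 2.0
    if d % 2 == 0:
        # Q(k, z) = exp(-z) * sum_{j=0}^{k-1} z^j / j!
        term, total = 1.0, 1.0
        for j in range(1, d // 2):
            term *= z / j
            total += term
        return math.exp(-z) * total
    # Q(k + 1/2, z) = erfc(sqrt(z)) + exp(-z) * sum_{j=1}^{k} z^(j - 1/2) / Gamma(j + 1/2)
    total = 0.0
    for j in range(1, (d - 1) // 2 + 1):
        total += z ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(z)) + math.exp(-z) * total

def expected_exceedances(m, d, x):
    """m * P(chi^2(d) >= x_n^2) with x_n^2 = 2 log m + (d - 2) log log m + x."""
    y = 2 * math.log(m) + (d - 2) * math.log(math.log(m)) + x
    return m * chi2_sf(y, d)

def bonferroni_partial(a, terms):
    """Partial sum sum_{l=1}^{terms} (-1)^(l-1) a^l / l!."""
    return sum((-1) ** (l - 1) * a ** l / math.factorial(l)
               for l in range(1, terms + 1))

d, x = 3, 1.0
limit = math.exp(-x / 2) / math.gamma(d / 2)
for m in (10**3, 10**6, 10**12):
    print(m, expected_exceedances(m, d, x), limit)

# Odd partial sums lie above 1 - exp(-a), even partial sums lie below.
print(bonferroni_partial(limit, 4), 1 - math.exp(-limit), bonferroni_partial(limit, 3))
```

For d = 2 the agreement is exact for every m, while for d = 3 the ratio to the limit approaches 1 only at a logarithmic rate in m, which is in line with the slow convergence of the extreme value approximation noted in the abstract.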

Proof of Lemma 5.4. Let X^ = (Xf, . . . ,X^)' and Y* = {¥{, . . .,Yjy. 
Put 

rij = max|max|Corr(X^ X^ )|,max|Corr(yfc\,y^^ )|| 

and 

I = ll <ii < ■ ■ ■ <il <m: max rjj.j. > (logm)~ ~'^ >. 

When Z = 1, we let X = 0. For 2 < j < / - 1, define 

Ij = {1 <ii < ■ ■ ■ <il <m: Card(S) = j, where S is the subset of 

{h, ■ ■ ■ ,ii} with the largest cardinality such that Vi^ ^ it £S, 

rikk <(logm)~^"'^}. 
For j = 1, define 

Zi = {1 <ii < ■ ■ ■ <ii <m: ri^i^ > (log m) '^ for every I < k <t <l}. 

It follows from the definition of Xj that I = |J Z^Xj. Then, by (CI), we have 
Card(X,) = 0{m^+^'^P''). Define 

I"" = {I < h < ■ ■ ■ < ii < m} \I. 

We have Card(X^) = C^ - 0{m^-^+^'^P^) = (l + o(l))C^. For (n, . . . , i;) G I", 



lcov((5if,...,5ii;>))-/., 



< C;(logm)~^~T + C(logm/n)(^+^)^/2_ 

Til ■■ ■""' ■""' ■■ 

By Lemma 5.2, the proof of Lemma 5.1 and some tedious calculations, 

^VlPmi II ^ yri) • • •) IPnii W—Iln) 

= (1 + o(l))P(||W,, II > W^/m, . . . , II Wi, II > y„/Vm), 

where Wj^ , • • • , Wj, are independent standard d-dimensional random normal 
vectors. By the tail probabilities of x^i^) distribution, 

{N}||^„, ||c{N}| 



Vpni^^^^ii >v ii^^^^ii>i/ 1 



(5.13) 
To prove the lemma, it suffices to show that for $1 \le j \le l - 1$,

\[
\sum_{I_j} P\big(\|\hat{S}_{ni_1}^{\{N\}}\| \ge y_n, \ldots, \|\hat{S}_{ni_l}^{\{N\}}\| \ge y_n\big) = o(1).
\tag{5.14}
\]

To keep notation brief, we assume S = {ii-j+i,. . . ,ii} for (ii,...,i;) £lj- 
Divide Xj into Iji and Ij2, where 

Iji = I 1 < ii < ■ ■ ■ < ii < m: there exists an A; G {ii , . . . , ii-j} 
such that for some ji,J2 S S with j'l ^ J2, r^j-^ > 



and r^j^ 



> 



(logm)i+T 
1 



(logm)i+'T 

and Ij2 =Ij \2ji. Then Card(Xji) = 0{m^~^^^^ ) and again by Lemma 5.2 
and the proof of Lemma 5.1, 

VPni9^^^ll>7y II -^^^^ll >v ) 



c{N} 11^., llciN} 

-j + 



< Vpni9^^^ II >v II9^^^II>?/ ^ 

= (1 + o(l)) j; P(|| W,,_^^, II > yJ^^, ..., II W,, II > yJ^^) 

For {ii,...,ii) G 2^2 and i/_j, there is only one ji £ S such that r^^.j^ > 
{\ogm)~^~'^ . For notation briefness, we can assume ji = i/_j+i. Thus, for 
any < e < 1, by Theorem 1 in Zaitsev (1987), 

PfllS^^^ II >V II9^^^II>7/ ^ 

^Vll'-'m,„jll ^Un,---, W^ni, II - Vn) 

(5.15) < P(|| W,,_,. II > (1 - e)yn/V^, ■■■, || Wi, || > (1 - e)yn/V^) 

+ ciexp(-C2(logm)i+(i-'^)/2)^ 

where ci and C2 only depend on d and e, (Wj^ ., . . . , WjJ are multivariate 
norm vector with covariance matrix Cov(5^^ ,,..., 5*^^ ). By the definition 
of Ij2, we can prove that 



i^-(s!Sl, s!S')-{^ ;) 

^ c ^ cr°""V'^"'''". 

(logm)^+'>' \ n J 
where Tt = n^^YIklX''^ '^°''iiK'' ^K''^^)) and I is (j - l)d-dimensional 
identity matrix. It follows that 

^^UPnii W —Vn:- ■ ■ A\^nii W—Vn) 

< VPni9^^^ II >W Il9^^^ll>7y 1 

+ o(l). 
Since maxi<j<j<p ||rjj || < r, we have ||D|| < 1 + r. This yields that 

P(ll(W,,_^.,W,,_^-,JII>(l-e)^^W^^^) 
(5.16) 

<C(logm)'^/2-im-2(i-^)V(i+0. 

Since p is arbitrarily small, we can let e satisfy 2(1 — e)^/(l + r) > 1 + /)/. 
This proves that 

E PdlC^II ^ ^«' • • • ' llC^II ^ y-) = O(m^-+^'-^-+i-2(i-)V(i+0) = ,(1). 

Lemma 5.4 is proved. □

5.4. Proof of Theorem 3.3. The proof of Theorem 3.3 is given in the 
supplement material [Liu and Shao (2013)]. 

5.5. Proof of Theorem 3.5. Let $i_0$ be the index such that



11-^-1/2^ Ml II'S^-V2/ Ml ^ //o , Nlog"i 

II ^io (/^iio-/^2io)ll= max ||S. {^l^^ - ii^^)\\ > \ {2 + e) . 

Take ll^ll = 1 such that e'J:T^^'\ii^,^ - fi,J = \\^;,'^\t^u, - f^2^o)\\■ Note 
th at yn{a ) = 2 log m + {d — 2) log log m + qa + o{l). We have for any < e < 

\/r+^-i, 

P($: = l)>P(r4>y„(a)) 

> P^E^X" > {l + e)VU^j +o{l) 

> p|f;^'(Zt«-EZ'^0>(l + £)\/yn(a)ni-V(2 + e)nilogpJ 

+ o{l) 
^1. 
SUPPLEMENTARY MATERIAL

Supplement to "A Cramér moderate deviation theorem for Hotelling's
$T^2$-statistic with applications to global tests" (DOI: 10.1214/12-AOS1082SUPP;
.pdf). The supplement material includes the moderate deviation result by
Sakhanenko (1991), the proof of Theorem 3.3 and the simulation results in
Section 4.

REFERENCES 

Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed.
Wiley, Hoboken, NJ. MR1990662

Arratia, R., Goldstein, L. and Gordon, L. (1989). Two moments suffice for Poisson
approximations: The Chen-Stein method. Ann. Probab. 17 9-25. MR0972770

Cai, T., Liu, W. and Xia, Y. (2012). Two-sample test of high dimensional means under
dependency. Technical report.

Cao, J. and Worsley, K. J. (1999). The detection of local shape changes via the
geometry of Hotelling's $T^2$ fields. Ann. Statist. 27 925-942. MR1724036

Chen, S. X. and Qin, Y.-L. (2010). A two-sample test for high-dimensional data with
applications to gene-set testing. Ann. Statist. 38 808-835. MR2604697

de la Peña, V. H., Lai, T. L. and Shao, Q.-M. (2009). Self-Normalized Processes:
Limit Theory and Statistical Applications. Springer, Berlin. MR2488094

Dembo, A. and Shao, Q.-M. (2006). Large and moderate deviations for Hotelling's
$T^2$-statistic. Electron. Commun. Probab. 11 149-159 (electronic). MR2240708

Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous
mixtures. Ann. Statist. 32 962-994. MR2065195

Fan, J., Hall, P. and Yao, Q. (2007). To how many simultaneous hypothesis tests can
normal, Student's t or bootstrap calibration be applied? J. Amer. Statist. Assoc. 102
1282-1288. MR2372536

Gerardin, E., Chetelat, G., Chupin, M., Cuingnet, R., Desgranges, B.,
Kim, H. S., Niethammer, M., Dubois, B., Lehericy, S., Garnero, L.,
Eustache, F. and Colliot, O. (2009). Multidimensional classification of
hippocampal shape features discriminates Alzheimer's disease and mild cognitive
impairment from normal aging. NeuroImage 47 1476-1486.

Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in
correlated noise. Ann. Statist. 38 1686-1732. MR2662357

Hall, P. and Wang, Q. (2010). Strong approximations of level exceedences related to
multiple hypothesis testing. Bernoulli 16 418-434. MR2668908

Jing, B.-Y., Shao, Q.-M. and Wang, Q. (2003). Self-normalized Cramér-type large
deviations for independent random variables. Ann. Probab. 31 2167-2215. MR2016616

Lin, Z. and Liu, W. (2009). On maxima of periodograms of stationary processes. Ann.
Statist. 37 2676-2695. MR2541443

Liu, W.-D., Lin, Z. and Shao, Q.-M. (2008). The asymptotic distribution and
Berry-Esseen bound of a new test for independence in high dimension with an application to
stochastic optimization. Ann. Appl. Probab. 18 2337-2366. MR2474539






Liu, W. and Shao, Q.-M. (2010). Cramér-type moderate deviation for the maximum of
the periodogram with application to simultaneous tests in gene expression time series.
Ann. Statist. 38 1913-1935. MR2662363

Liu, W. and Shao, Q. M. (2013). Supplement to "A Cramér moderate deviation
theorem for Hotelling's $T^2$-statistic with applications to global tests."
DOI:10.1214/12-AOS1082SUPP.

Martens, J. W. M., Nimmrich, I., Koenig, T., Look, M. P., Harbeck, N.,
Model, F., Kluth, A., de Vries, J. B., Sieuwerts, A. M., Portengen, H.,
Meijer-Van Gelder, M. E., Piepenbrock, C., Olek, A., Hofler, H., Kiechle, M.,
Klijn, J. G. K., Schmitt, M., Maier, S. and Foekens, J. A. (2005). Association
of DNA methylation of Phosphoserine Aminotransferase with response to endocrine
therapy in patients with recurrent breast cancer. Cancer Research 65 4101-4117.

Sakhanenko, A. I. (1991). Berry-Esseen type estimates for large deviation probabilities.
Sib. Math. J. 32 647-656.

Shao, Q.-M. (1999). A Cramér type large deviation result for Student's t-statistic.
J. Theoret. Probab. 12 385-398. MR1684750

Storey, J. D. and Tibshirani, R. (2001). Estimating false discovery rates under
dependence, with applications to DNA microarrays. Technical report.

Styner, M., Oguz, I., Xu, S., Brechbuhler, C., Pantazis, D., Levitt, J. J.,
Shenton, M. E. and Gerig, G. (2006). Framework for the statistical shape analysis of brain
structures using SPHARM-PDM. Insight Journal 1071 242-250.

Taylor, J. E. and Worsley, K. J. (2008). Random fields of multivariate test statistics,
with applications to shape analysis. Ann. Statist. 36 1-27. MR2387962

Zaitsev, A. Y. (1987). On the Gaussian approximation of convolutions under
multidimensional analogues of S. N. Bernstein's inequality conditions. Probab. Theory Related
Fields 74 535-566. MR0876255

Zhao, Z., Taylor, W. D., Styner, M., Steffens, D. C., Krishnan, K. R. R. and
MacFall, J. R. (2008). Hippocampus shape analysis and late-life depression. PLoS
One 3 e1837.



Department of Mathematics 
Institute of Natural Sciences 
Shanghai Jiao Tong University
Shanghai, P.R. China 
E-MAIL: liuweidong99@gmail.com



Department of Statistics 

Chinese University of Hong Kong 

Shatin, N.T. 

Hong Kong, P.R. China 

E-MAIL: qmshao@cuhk.edu.hk



